Boyer - Moore String Matching over Ziv -
نویسندگان
چکیده
We present a Boyer-Moore approach to string matching over LZ78 and LZW compressed text. The key idea is that, despite that we cannot exactly choose which text characters to inspect, we can still use the characters explicitly represented in those formats to shift the pattern in the text. We present a basic approach and more advanced ones. Despite that the theoretical average complexity does not improve because still all the symbols in the compressed text have to be scanned, we show experimentally that speedups of up to 30% over the fastest previous approaches are obtained. Moreover, we show that using an encoding method that sacriices some compression ratio our method is twice as fast as decompressing plus searching using the best available algorithms.
منابع مشابه
LZgrep: a Boyer-Moore string matching tool for Ziv-Lempel compressed text
We present a Boyer-Moore approach to string matching over LZ78 and LZW compressed text. The idea is to search the text directly in compressed form instead of decompressing and then searching it. We modify the Boyer-Moore approach so as to skip text using the characters explicitly represented in the LZ78/LZW formats, modifying the basic technique where the algorithm can choose which characters t...
متن کاملBoyer-Moore String Matching over Ziv-Lempel Compressed Text
We present a Boyer-Moore approach to string matching over LZ78 and LZW compressed text. The key idea is that, despite that we cannot exactly choose which text characters to inspect, we can still use the characters explicitly represented in those formats to shift the pattern in the text. We present a basic approach and more advanced ones. Despite that the theoretical average complexity does not ...
متن کاملString Matching Rules Used by Variants of Boyer-moore Algorithm
String matching problem is widely studied problem in computer science, mainly due to its large applications used in various fields. In this regards many string matching algorithms have been proposed. Boyer-Moore is most popular algorithm. Hence, maximum variants are proposed from Boyer-Moore (BM) algorithm. This paper addresses the variant of Boyer-Moore algorithm for finding the occurrences of...
متن کاملApproximate Boyer-Moore String Matching
The Boyer-Moore idea applied in exact string matching is generalized to approximate string matching. Two versions of the problem are considered. The k mismatches problem is to find all approximate occurrences of a pattern string (length m) in a text string (length n) with at most k mismatches. Our generalized Boyer-Moore algorithm is shown (under a mild independence assumption) to solve the pro...
متن کاملLongest - match String Searching for Ziv – Lempel Compression timothy
SUMMARY Ziv–Lempel coding is currently one of the more practical data compression schemes. It operates by replacing a substring of a text with a pointer to its longest previous occurrence in the input, for each coding step. Decoding a compressed file is very fast, but encoding involves searching at each coding step to find the longest match for the next few characters. This paper presents eight...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008